Is prioritized sweeping the better episodic control?
Abstract
Episodic control has been proposed as a third approach to reinforcement learning, besides model-free and model-based control, by analogy with the three types of human memory, i.e., episodic, procedural and semantic memory. However, the theoretical properties of episodic control are not well investigated. Here I show that in deterministic tree Markov decision processes, episodic control is equivalent to a form of prioritized sweeping in terms of sample efficiency as well as memory and computation demands. For general deterministic and stochastic environments, prioritized sweeping performs better even when memory and computation demands are restricted to be equal to those of episodic control. These results suggest generalizations of prioritized sweeping to partially observable environments, its combined use with function approximation and the search for possible implementations of prioritized sweeping in brains.
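The difference the abstract describes for general deterministic environments can be illustrated with a small sketch. It is not the paper's algorithm: it assumes that episodic control stores, for each state-action pair, the best discounted return observed so far, and that prioritized sweeping keeps a deterministic one-step model with a priority queue over Bellman errors. The toy environment (states `S0`, `S1`, `S2`, terminal `T`), the function names and the parameters `theta` and `max_backups` are hypothetical, and the sketch does not impose the equal memory and computation budgets considered in the paper.

```python
import heapq
from collections import defaultdict

GAMMA = 0.9  # assumed discount factor


def episodic_control(episodes, gamma=GAMMA):
    """Assumed rule: Q(s, a) = best discounted return seen after taking a in s."""
    q = {}
    for episode in episodes:
        ret = 0.0
        # Walk the episode backwards to accumulate the discounted return.
        for s, a, r, _s_next in reversed(episode):
            ret = r + gamma * ret
            q[(s, a)] = max(q.get((s, a), float("-inf")), ret)
    return q


def prioritized_sweeping(episodes, gamma=GAMMA, theta=1e-9, max_backups=100):
    """Deterministic-model prioritized sweeping (illustrative sketch)."""
    model = {}                       # (s, a) -> (r, s_next)
    predecessors = defaultdict(set)  # s_next -> {(s, a), ...}
    actions = defaultdict(set)       # s -> actions tried so far
    q = defaultdict(float)           # unseen pairs default to 0.0

    def value(s):
        # States without tried actions (e.g. terminal ones) have value 0.
        return max((q[(s, a)] for a in actions[s]), default=0.0)

    def bellman_error(s, a):
        r, s_next = model[(s, a)]
        return abs(r + gamma * value(s_next) - q[(s, a)])

    for episode in episodes:
        for s, a, r, s_next in episode:
            model[(s, a)] = (r, s_next)
            predecessors[s_next].add((s, a))
            actions[s].add(a)
            queue = []
            priority = bellman_error(s, a)
            if priority > theta:
                heapq.heappush(queue, (-priority, (s, a)))
            for _ in range(max_backups):
                if not queue:
                    break
                _, (ps, pa) = heapq.heappop(queue)
                pr, pnext = model[(ps, pa)]
                q[(ps, pa)] = pr + gamma * value(pnext)
                # Re-queue predecessors whose backed-up value has changed.
                for pred in predecessors[ps]:
                    priority = bellman_error(*pred)
                    if priority > theta:
                        heapq.heappush(queue, (-priority, pred))
    return dict(q)


# Non-tree toy environment: S0 and S1 both lead to S2.
episodes = [
    [("S0", 0, 0.0, "S2"), ("S2", 0, 0.0, "T")],  # poor action chosen in S2
    [("S1", 0, 0.0, "S2"), ("S2", 1, 1.0, "T")],  # good action found in S2
]
ec = episodic_control(episodes)
ps = prioritized_sweeping(episodes)
print(ec[("S0", 0)], ps[("S0", 0)])
```

In this toy case the reward discovered in `S2` during the second episode is propagated back to `S0` by prioritized sweeping, whereas the episodic-control table for `S0` still reflects only the first episode. In a tree, where each state has a single predecessor path, no such sharing is possible.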
Related resources
Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time
We present a new algorithm, Prioritized Sweeping, for efficient prediction and control of stochastic Markov systems. Incremental learning methods such as Temporal Differencing and Q-learning have fast real time performance. Classical methods are slower, but more accurate, because they make full use of the observations. Prioritized Sweeping aims for the best of both worlds. It uses all previous expe...
Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping
We present a new algorithm, Prioritized Sweeping, for efficient prediction and control of stochastic Markov systems. Incremental learning methods such as Temporal Differencing and Q-learning have fast real time performance. Classical methods are slower, but more accurate, because they make full use of the observations....
Generalized Prioritized Sweeping
Prioritized sweeping is a model-based reinforcement learning method that attempts to focus an agent’s limited computational resources to achieve a good estimate of the value of environment states. To choose effectively where to spend a costly planning step, classic prioritized sweeping uses a simple heuristic to focus computation on the states that are likely to have the largest errors. In this...
Prioritized Sweeping Reinforcement Learning Based Routing for MANETs
In this paper, prioritized sweeping confidence-based dual reinforcement learning based adaptive network routing is investigated. Shortest-path routing is not always suitable for wireless mobile networks: under high traffic it always selects the path with the fewest hops between source and destination, thus generating more congestion. In prior...
A Reinforcement Learning Based Discrete Supplementary Control for Power System Transient Stability Enhancement
This paper proposes an application of a Reinforcement Learning (RL) method to the control of a dynamic brake aimed at enhancing power system transient stability. The control law of the resistive brake is in the form of switching strategies. In particular, the paper focuses on the application of a model-based RL method, known as prioritized sweeping, a method proven to be suitable in applications ...
Journal: CoRR
Volume: abs/1711.06677
Publication date: 2017